我们提出了一个新颖的半监督学习框架,该框架巧妙地利用了模型的预测,从两个强烈的图像观点中的预测之间的一致性正则化,并由伪标签的信心加权,称为conmatch。虽然最新的半监督学习方法使用图像的弱和强烈的观点来定义方向的一致性损失,但如何为两个强大的观点之间的一致性定义定义这种方向仍然没有探索。为了解决这个问题,我们通过弱小的观点作为非参数和参数方法中的锚点来提出从强大的观点中对伪标签的新颖置信度度量。特别是,在参数方法中,我们首次介绍了伪标签在网络中的信心,该网络的信心是以端到端方式通过骨干模型学习的。此外,我们还提出了阶段训练,以提高培训的融合。当纳入现有的半监督学习者中时,并始终提高表现。我们进行实验,以证明我们对最新方法的有效性并提供广泛的消融研究。代码已在https://github.com/jiwoncocoder/conmatch上公开提供。
translated by 谷歌翻译
训练与人交往的机器人具有挑战性。直接让人们参与培训过程是昂贵的,这需要大量的数据样本。本文提出了解决此问题的另一种方法。我们提出了一个人类路径预测网络(HPPN),该网络基于连续的神经网络结构来基于连续机器人动作和人类响应生成用户的未来轨迹。随后,出现了一种基于进化 - 策略的机器人训练方法,仅使用使用HPPN生成的虚拟人类运动。证明我们提出的方法允许对视力受损的人进行机器人指南的样品培训。通过仅收集来自真实用户的1.5 K剧集,我们能够训练HPPN并生成训练机器人所需的100 k个虚拟剧集。训练有素的机器人精确地沿着目标路径蒙住眼睛。此外,使用虚拟情节,我们研究了一种新的奖励设计,该设计在机器人的指导中优先考虑人类的舒适性,而不会产生额外的费用。预计这种样品效率的训练方法将广泛适用于未来与人体互动的机器人。
translated by 谷歌翻译
In this paper, we propose a diffusion-based face swapping framework for the first time, called DiffFace, composed of training ID conditional DDPM, sampling with facial guidance, and a target-preserving blending. In specific, in the training process, the ID conditional DDPM is trained to generate face images with the desired identity. In the sampling process, we use the off-the-shelf facial expert models to make the model transfer source identity while preserving target attributes faithfully. During this process, to preserve the background of the target image and obtain the desired face swapping result, we additionally propose a target-preserving blending strategy. It helps our model to keep the attributes of the target face from noise while transferring the source facial identity. In addition, without any re-training, our model can flexibly apply additional facial guidance and adaptively control the ID-attributes trade-off to achieve the desired results. To the best of our knowledge, this is the first approach that applies the diffusion model in face swapping task. Compared with previous GAN-based approaches, by taking advantage of the diffusion model for the face swapping task, DiffFace achieves better benefits such as training stability, high fidelity, diversity of the samples, and controllability. Extensive experiments show that our DiffFace is comparable or superior to the state-of-the-art methods on several standard face swapping benchmarks.
translated by 谷歌翻译
For change detection in remote sensing, constructing a training dataset for deep learning models is difficult due to the requirements of bi-temporal supervision. To overcome this issue, single-temporal supervision which treats change labels as the difference of two semantic masks has been proposed. This novel method trains a change detector using two spatially unrelated images with corresponding semantic labels such as building. However, training on unpaired datasets could confuse the change detector in the case of pixels that are labeled unchanged but are visually significantly different. In order to maintain the visual similarity in unchanged area, in this paper, we emphasize that the change originates from the source image and show that manipulating the source image as an after-image is crucial to the performance of change detection. Extensive experiments demonstrate the importance of maintaining visual information between pre- and post-event images, and our method outperforms existing methods based on single-temporal supervision. code is available at https://github.com/seominseok0429/Self-Pair-for-Change-Detection.
translated by 谷歌翻译
In recent years, generative models have undergone significant advancement due to the success of diffusion models. The success of these models is often attributed to their use of guidance techniques, such as classifier and classifier-free methods, which provides effective mechanisms to trade-off between fidelity and diversity. However, these methods are not capable of guiding a generated image to be aware of its geometric configuration, e.g., depth, which hinders the application of diffusion models to areas that require a certain level of depth awareness. To address this limitation, we propose a novel guidance approach for diffusion models that uses estimated depth information derived from the rich intermediate representations of diffusion models. To do this, we first present a label-efficient depth estimation framework using the internal representations of diffusion models. At the sampling phase, we utilize two guidance techniques to self-condition the generated image using the estimated depth map, the first of which uses pseudo-labeling, and the subsequent one uses a depth-domain diffusion prior. Experiments and extensive ablation studies demonstrate the effectiveness of our method in guiding the diffusion models toward geometrically plausible image generation. Project page is available at https://ku-cvlab.github.io/DAG/.
translated by 谷歌翻译
Deep learning-based weather prediction models have advanced significantly in recent years. However, data-driven models based on deep learning are difficult to apply to real-world applications because they are vulnerable to spatial-temporal shifts. A weather prediction task is especially susceptible to spatial-temporal shifts when the model is overfitted to locality and seasonality. In this paper, we propose a training strategy to make the weather prediction model robust to spatial-temporal shifts. We first analyze the effect of hyperparameters and augmentations of the existing training strategy on the spatial-temporal shift robustness of the model. Next, we propose an optimal combination of hyperparameters and augmentation based on the analysis results and a test-time augmentation. We performed all experiments on the W4C22 Transfer dataset and achieved the 1st performance.
translated by 谷歌翻译
Traditional weather forecasting relies on domain expertise and computationally intensive numerical simulation systems. Recently, with the development of a data-driven approach, weather forecasting based on deep learning has been receiving attention. Deep learning-based weather forecasting has made stunning progress, from various backbone studies using CNN, RNN, and Transformer to training strategies using weather observations datasets with auxiliary inputs. All of this progress has contributed to the field of weather forecasting; however, many elements and complex structures of deep learning models prevent us from reaching physical interpretations. This paper proposes a SImple baseline with a spatiotemporal context Aggregation Network (SIANet) that achieved state-of-the-art in 4 parts of 5 benchmarks of W4C22. This simple but efficient structure uses only satellite images and CNNs in an end-to-end fashion without using a multi-model ensemble or fine-tuning. This simplicity of SIANet can be used as a solid baseline that can be easily applied in weather forecasting using deep learning.
translated by 谷歌翻译
This paper proposes a graph-based approach to representing spatio-temporal trajectory data that allows an effective visualization and characterization of city-wide traffic dynamics. With the advance of sensor, mobile, and Internet of Things (IoT) technologies, vehicle and passenger trajectories are being increasingly collected on a massive scale and are becoming a critical source of insight into traffic pattern and traveller behaviour. To leverage such trajectory data to better understand traffic dynamics in a large-scale urban network, this study develops a trajectory-based network traffic analysis method that converts individual trajectory data into a sequence of graphs that evolve over time (known as dynamic graphs or time-evolving graphs) and analyses network-wide traffic patterns in terms of a compact and informative graph-representation of aggregated traffic flows. First, we partition the entire network into a set of cells based on the spatial distribution of data points in individual trajectories, where the cells represent spatial regions between which aggregated traffic flows can be measured. Next, dynamic flows of moving objects are represented as a time-evolving graph, where regions are graph vertices and flows between them are treated as weighted directed edges. Given a fixed set of vertices, edges can be inserted or removed at every time step depending on the presence of traffic flows between two regions at a given time window. Once a dynamic graph is built, we apply graph mining algorithms to detect change-points in time, which represent time points where the graph exhibits significant changes in its overall structure and, thus, correspond to change-points in city-wide mobility pattern throughout the day (e.g., global transition points between peak and off-peak periods).
translated by 谷歌翻译
Traversability estimation for mobile robots in off-road environments requires more than conventional semantic segmentation used in constrained environments like on-road conditions. Recently, approaches to learning a traversability estimation from past driving experiences in a self-supervised manner are arising as they can significantly reduce human labeling costs and labeling errors. However, the self-supervised data only provide supervision for the actually traversed regions, inducing epistemic uncertainty according to the scarcity of negative information. Negative data are rarely harvested as the system can be severely damaged while logging the data. To mitigate the uncertainty, we introduce a deep metric learning-based method to incorporate unlabeled data with a few positive and negative prototypes in order to leverage the uncertainty, which jointly learns using semantic segmentation and traversability regression. To firmly evaluate the proposed framework, we introduce a new evaluation metric that comprehensively evaluates the segmentation and regression. Additionally, we construct a driving dataset `Dtrail' in off-road environments with a mobile robot platform, which is composed of a wide variety of negative data. We examine our method on Dtrail as well as the publicly available SemanticKITTI dataset.
translated by 谷歌翻译
This work presents six structural quality metrics that can measure the quality of knowledge graphs and analyzes five cross-domain knowledge graphs on the web (Wikidata, DBpedia, YAGO, Google Knowledge Graph, Freebase) as well as 'Raftel', Naver's integrated knowledge graph. The 'Good Knowledge Graph' should define detailed classes and properties in its ontology so that knowledge in the real world can be expressed abundantly. Also, instances and RDF triples should use the classes and properties actively. Therefore, we tried to examine the internal quality of knowledge graphs numerically by focusing on the structure of the ontology, which is the schema of knowledge graphs, and the degree of use thereof. As a result of the analysis, it was possible to find the characteristics of a knowledge graph that could not be known only by scale-related indicators such as the number of classes and properties.
translated by 谷歌翻译